Linear array of photodiodes to track a human speaker for video recording
Authors
Abstract
Communication and collaboration using stored digital media have attracted growing interest from many areas of business, government, and education in recent years, due primarily to improvements in camera quality and computer speed. An advantage of digital media is that it can serve as an effective alternative when physical interaction is not possible. Video recordings that allow viewers to discern a presenter's facial features, lips, and hand motions are more effective than videos that do not. To achieve this, the speaker must occupy a significant portion of the captured pixels. However, camera operators are costly and often do an imperfect job of tracking presenters in unrehearsed situations. This motivates a robust, automated system that directs a video camera to follow a presenter as he or she walks anywhere in the front of a lecture hall or large conference room. Such a system is presented. It consists of a commercial, off-the-shelf pan/tilt/zoom (PTZ) color video camera, a necklace of infrared LEDs, and a linear photodiode array detector. Electronic output from the photodiode array is processed to determine the location of the LED necklace, which is worn by the speaker. A computer then controls the video camera movements to record video of the speaker. The speaker's vertical position and depth are assumed to remain relatively constant, so the video camera is sent only panning (horizontal) movement commands. The LED necklace is flashed at 70 Hz with a 50% duty cycle to provide noise-filtering capability. The benefit of using a photodiode array rather than a standard video camera is its higher frame rate (4 kHz vs. 60 Hz). The higher frame rate allows infrared noise such as sunlight and indoor lighting to be filtered out, a capability absent from other tracking technologies. The system has been tested in a large lecture hall and is shown to be effective.
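The abstract does not say how the photodiode output is filtered, so the following is only a minimal sketch of one plausible approach: measure the 70 Hz component of each pixel's 4 kHz sample stream with a single-bin DFT and take the pixel with the strongest response as the necklace location. The function names, the 400-frame window (exactly 7 LED periods), the 16-pixel array, and the simulated 120 Hz lamp flicker are all illustrative assumptions, not details from the paper.

```python
import math

SAMPLE_RATE_HZ = 4000.0  # photodiode array frame rate (from the abstract)
LED_FREQ_HZ = 70.0       # LED flash rate (from the abstract)


def amplitude_at(samples, freq_hz, sample_rate_hz):
    """Return the magnitude of one frequency component (single-bin DFT)."""
    re = 0.0
    im = 0.0
    for n, x in enumerate(samples):
        phase = 2.0 * math.pi * freq_hz * n / sample_rate_hz
        re += x * math.cos(phase)
        im -= x * math.sin(phase)
    return 2.0 * math.hypot(re, im) / len(samples)


def locate_led(frames):
    """frames[t][p] is the reading of pixel p at frame t.

    Returns (index of the pixel with the strongest 70 Hz component,
    list of per-pixel 70 Hz amplitudes).
    """
    num_pixels = len(frames[0])
    scores = [
        amplitude_at([frame[p] for frame in frames], LED_FREQ_HZ, SAMPLE_RATE_HZ)
        for p in range(num_pixels)
    ]
    best = max(range(num_pixels), key=scores.__getitem__)
    return best, scores


def simulate_frames(num_frames=400, num_pixels=16, led_pixel=11):
    """Hypothetical test data: steady sunlight, 120 Hz lamp flicker, weak LED."""
    frames = []
    for t in range(num_frames):
        time_s = t / SAMPLE_RATE_HZ
        # 50% duty cycle: LED is on during the first half of each 70 Hz period.
        led_on = (time_s * LED_FREQ_HZ) % 1.0 < 0.5
        lamp = 0.5 * math.sin(2.0 * math.pi * 120.0 * time_s)
        row = []
        for p in range(num_pixels):
            sunlight = 5.0 if p == 3 else 1.0  # pixel 3 sees a bright window
            led = 0.2 if (p == led_pixel and led_on) else 0.0
            row.append(sunlight + lamp + led)
        frames.append(row)
    return frames


if __name__ == "__main__":
    pixel, scores = locate_led(simulate_frames())
    print("estimated LED pixel:", pixel)
```

In this simulation the brightest pixel overall is the sunlit one, yet the 70 Hz score singles out the much weaker LED pixel, because constant light and 120 Hz flicker contribute essentially nothing to the 70 Hz bin over a window holding a whole number of their periods. A 60 Hz camera could not do this, since sampling a 70 Hz signal without aliasing requires a frame rate above 140 Hz.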
Similar resources
Acceleration-Based Quality Assessment of Railway Tracks using a 2D simulation model and recorded track data
Car body acceleration is an important factor affecting track safety and ride comfort, which are two primary aspects of railway systems. Though track level is an important source of wagon body acceleration, no quantitative relation between them is available and the aim of this paper is to propose a method to address this issue. To do so, car body acceleration is determined using a 10 DOF simulat...
Generating and Focusing the Ultrasound Waves Using Elastomer-based Capacitive Micro-Speakers
Ultrasound is a wave with a frequency above the range of human hearing. Although ultrasound was first used for military identification purposes, it has been used for decades in various other applications, especially medical ones. Medical applications of ultrasound include diagnostic and therapeutic applications, such as the treatment of cancer. In this paper,...
A Comparative Study of Gender and Age Classification in Speech Signals
Accurate gender classification is useful in speech and speaker recognition as well as speech emotion classification, because a better performance has been reported when separate acoustic models are employed for males and females. Gender classification is also apparent in face recognition, video summarization, human-robot interaction, etc. Although gender classification is rather mature in a...
A visually-guided microphone array for automatic speech transcription
An integrated, modular real-time microphone array system has been implemented to detect, track and extract speech from a person in a realistic office environment. Multimodal integration, whereby audio and visual information are used together to detect and track the speaker, is examined to determine comparative advantages over unimodal processing. An extensive quantitative comparison is also per...
Meeting Recording System via Multimodal Sensing
In this paper, we propose a recording system for a round-table meeting using a microphone array as well as an omnidirectional video camera. This equipment is located at the center of the table and records all the activity during a meeting. It is also used for estimating the directions of speakers, and such data is exploited for reproducing the frontal image of the speaker from omnidirectional ...